949 results for logic tree, logicFS, Monte Carlo logic regression, genetic programming for association study, random forest, GENICA


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Genetic research on complex diseases is a challenging but exciting area. Early progress was limited until the completion of the Human Genome and HapMap projects, which, together with the falling cost of genotyping, paved the way for understanding the genetic composition of complex diseases. In this thesis, we focus on statistical methods for two aspects of genetic research: phenotype definition for diseases with complex etiology, and methods for identifying potentially associated Single Nucleotide Polymorphisms (SNPs) and SNP-SNP interactions. With regard to phenotype definition, we first investigate the effects of different statistical phenotyping approaches on the subsequent analysis. In light of the findings, and of the difficulties in validating the estimated phenotype, we propose two methods for reconciling the phenotypes produced by different models, using Bayesian model averaging as a coherent mechanism for accounting for model uncertainty. In the second part of the thesis, the focus turns to methods for identifying associated SNPs and SNP interactions. We review the use of Bayesian logistic regression with variable selection for SNP identification and extend the model to detect interaction effects in population-based case-control studies. In this part of the study, we also develop a machine learning algorithm to cope with large-scale data analysis, namely modified Logic Regression with Genetic Program (MLR-GEP), which is then compared with the Bayesian model, Random Forests and other variants of logic regression.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The problem of estimating the time-dependent statistical characteristics of a random dynamical system is studied under two different settings. In the first, the system dynamics is governed by a differential equation parameterized by a random parameter, while in the second, this is governed by a differential equation with an underlying parameter sequence characterized by a continuous time Markov chain. We propose, for the first time in the literature, stochastic approximation algorithms for estimating various time-dependent process characteristics of the system. In particular, we provide efficient estimators for quantities such as the mean, variance and distribution of the process at any given time as well as the joint distribution and the autocorrelation coefficient at different times. A novel aspect of our approach is that we assume that information on the parameter model (i.e., its distribution in the first case and transition probabilities of the Markov chain in the second) is not available in either case. This is unlike most other work in the literature that assumes availability of such information. Also, most of the prior work in the literature is geared towards analyzing the steady-state system behavior of the random dynamical system while our focus is on analyzing the time-dependent statistical characteristics which are in general difficult to obtain. We prove the almost sure convergence of our stochastic approximation scheme in each case to the true value of the quantity being estimated. We provide a general class of strongly consistent estimators for the aforementioned statistical quantities with regular sample average estimators being a specific instance of these. We also present an application of the proposed scheme on a widely used model in population biology. Numerical experiments in this framework show that the time-dependent process characteristics as obtained using our algorithm in each case exhibit excellent agreement with exact results. 
(C) 2010 Elsevier Inc. All rights reserved.
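
The recursive estimators described in this abstract can be illustrated with a minimal sketch. It assumes a Robbins-Monro style update with step size 1/n, which reduces to the ordinary sample average (the special case the abstract mentions). The toy system dx/dt = -theta * x, the uniform parameter distribution and the names `sa_mean` and `simulate_state` are all hypothetical and are not taken from the paper. Note that the estimator itself only sees the simulated outputs, never the parameter distribution.

```python
import math
import random


def sa_mean(samples, step=lambda n: 1.0 / n):
    """Stochastic approximation of a mean: x <- x + a_n * (y_n - x)."""
    estimate = 0.0
    for n, y in enumerate(samples, start=1):
        estimate += step(n) * (y - estimate)
    return estimate


def simulate_state(theta, t):
    """Hypothetical toy system dx/dt = -theta * x with x(0) = 1."""
    return math.exp(-theta * t)


rng = random.Random(0)
t = 1.0
# Assumed parameter model, used only to generate data: theta ~ Uniform(0.5, 1.5).
samples = [simulate_state(rng.uniform(0.5, 1.5), t) for _ in range(20000)]
estimate = sa_mean(samples)

# Closed-form value of E[exp(-theta * t)] at t = 1 for this toy model.
exact = math.exp(-0.5) - math.exp(-1.5)
print(f"estimate={estimate:.4f} exact={exact:.4f}")
```

Other step-size sequences can be passed through `step`; the 1/n choice is used here only because it makes the link to the sample average explicit.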

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Regulatory authorities in many countries are introducing performance-based regulation in order to maintain an acceptable balance between customer service quality and cost. These regulations impose penalties, and in some cases rewards, which introduce a component of financial risk for an electric power utility because of the uncertainty associated with preserving a specific level of system reliability. In Brazil, for instance, one of the reliability indices receiving special attention from the utilities is the Maximum Continuous Interruption Duration per customer (MCID). This paper describes a chronological Monte Carlo simulation approach to evaluate probability distributions of reliability indices, including the MCID, and the corresponding penalties. To achieve the desired efficiency, modern computational techniques are used for modeling (UML, Unified Modeling Language) as well as for programming (Object-Oriented Programming). Case studies on a simple distribution network and on real Brazilian distribution systems are presented and discussed. © Copyright KTH 2006.
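
As a rough illustration of what a chronological Monte Carlo simulation of the MCID might look like, the sketch below steps through each simulated year in time order for a single supply point. Everything numeric here is an assumption made for the example: exponential times to failure and repair, 2 failures per year, a 3-hour mean repair time and a 6-hour MCID limit. The function names are hypothetical and the paper's own network model is not reproduced.

```python
import random

HOURS_PER_YEAR = 8760.0


def simulate_year(rng, failure_rate_per_year, mean_repair_hours):
    """Walk through one year chronologically and return all interruption durations."""
    t = 0.0
    durations = []
    while True:
        # Time to the next failure (exponential, rate expressed per hour).
        t += rng.expovariate(failure_rate_per_year / HOURS_PER_YEAR)
        if t >= HOURS_PER_YEAR:
            break
        repair = rng.expovariate(1.0 / mean_repair_hours)
        durations.append(repair)
        t += repair
    return durations


def mcid_distribution(n_years, failure_rate_per_year, mean_repair_hours, seed=0):
    """Sample the yearly maximum continuous interruption duration (hours)."""
    rng = random.Random(seed)
    return [
        max(simulate_year(rng, failure_rate_per_year, mean_repair_hours), default=0.0)
        for _ in range(n_years)
    ]


mcid = mcid_distribution(5000, failure_rate_per_year=2.0, mean_repair_hours=3.0)
limit_hours = 6.0  # assumed regulatory limit, for illustration only
prob_violation = sum(m > limit_hours for m in mcid) / len(mcid)
print(f"P(MCID > {limit_hours} h) ~ {prob_violation:.3f}")
```

A penalty distribution would follow by mapping each sampled MCID through the regulator's penalty rule, which the abstract does not specify.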

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The sequential Monte Carlo / Quantum Mechanics method was used to obtain the solvatochromic shifts and dipole moments of the following systems of organic molecules: uracil in aqueous solution, β-carotene in oleic acid, ricinoleic acid in methanol and in ethanol, and oleic acid in methanol and in ethanol. Geometry optimizations and charge distributions were obtained with Density Functional Theory using the B3LYP functional and the 6-31G(d) basis set for all molecules except water and uracil, for which the 6-311++G(d,p) basis set was used. In the classical Monte Carlo treatment, the Metropolis algorithm was applied through the DICE program. Statistically relevant configurations for computing the average properties were selected using the autocorrelation function calculated for each system. The radial distribution function of the molecular liquids was used to isolate the first solvation shell, which accounts for the main solute-solvent interaction. The relevant configurations of the first solvation shell of each system were submitted to semi-empirical quantum calculations with the ZINDO/S-CI method. Absorption spectra were obtained for the solutes in the gas phase and for the molecular liquid systems described above. Their electric dipole moments were also obtained. All bands of the absorption spectra showed a blue shift, except the second band of the β-carotene in oleic acid system, which showed a red shift. The results are in excellent agreement with experimental values found in the literature. All systems showed an increase in the electric dipole moment because the solvent molecules are polar.
The fatty acid systems in alcohols gave very similar results; that is, the fatty acids mentioned show similar spectroscopic behavior when subjected to the same solvents. The sequential Monte Carlo / Quantum Mechanics simulations demonstrated that the methodology is effective for obtaining the spectroscopic properties of the molecular liquids analyzed.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Monte Carlo simulations have been carried out to study the effect of temperature on the growth kinetics of a circular grain. This work demonstrates the importance of roughening fluctuations on the growth dynamics. Since the effect of thermal fluctuations is stronger in d = 2 than in d = 3, as predicted by d = 3 theories of domain kinetics, the circular domain shrinks linearly with time as A(t) = A(0) - αt, where A(0) and A(t) are the initial and instantaneous areas, respectively. However, in contrast to d = 3, the slope α is strongly temperature dependent for T ≥ 0.6 T_C. An analytical theory which considers the thermal fluctuations agrees with the T dependence of the Monte Carlo data in this regime, and this model shows that these fluctuations are responsible for the strong temperature dependence of the growth rate for d = 2. Our results are particularly relevant to the problem of domain growth in surface science.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

MSC Subject Classification: 65C05, 65U05.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a methodology for choosing the distribution network reconfiguration with the lowest power losses. The proposed methodology is based on statistical failure and repair data of the distribution power system components and uses fuzzy-probabilistic modeling for system component outage parameters. The proposed hybrid method, which combines fuzzy sets and Monte Carlo simulation based on the fuzzy-probabilistic models, captures both the randomness and the fuzziness of component outage parameters. Once the system states have been obtained by Monte Carlo simulation, a logic programming algorithm is applied to get all possible reconfigurations for each system state. An AC load flow is then applied to evaluate the line flows and bus voltages, to identify any overloading and/or voltage violation, and to select the feasible reconfiguration with the lowest power losses. To illustrate the application of the proposed methodology, the paper includes a case study on a 115-bus distribution network.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Statistical approaches to evaluate higher order SNP-SNP and SNP-environment interactions are critical in genetic association studies, as susceptibility to complex disease is likely to be related to the interaction of multiple SNPs and environmental factors. Logic regression (Kooperberg et al., 2001; Ruczinski et al., 2003) is one such approach, where interactions between SNPs and environmental variables are assessed in a regression framework, and interactions become part of the model search space. In this manuscript we extend the logic regression methodology, originally developed for cohort and case-control studies, for studies of trios with affected probands. Trio logic regression accounts for the linkage disequilibrium (LD) structure in the genotype data, and accommodates missing genotypes via haplotype-based imputation. We also derive an efficient algorithm to simulate case-parent trios where genetic risk is determined via epistatic interactions.
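
To make the idea of a Boolean SNP combination entering a regression model concrete, here is a minimal sketch that scores one fixed logic term. The tree (SNP1 AND NOT SNP2) OR SNP3, the eight toy subjects and the function names are hypothetical. The sketch covers only the plain case-control setting with a single binary covariate, where the logistic maximum likelihood estimates have a closed form; it does not implement the model search over trees or the trio extension described in the abstract.

```python
import math


def logic_term(x):
    """Hypothetical logic tree on binary-coded SNPs: (SNP1 AND NOT SNP2) OR SNP3."""
    return int((x[0] and not x[1]) or x[2])


def fit_single_binary_logistic(L, y):
    """Closed-form MLE of logit P(y = 1) = b0 + b1 * L for one binary covariate L."""
    def logit(p):
        return math.log(p / (1.0 - p))

    p0 = sum(yi for li, yi in zip(L, y) if li == 0) / L.count(0)
    p1 = sum(yi for li, yi in zip(L, y) if li == 1) / L.count(1)
    b0 = logit(p0)
    return b0, logit(p1) - b0


# Toy data: binary SNP indicators and case (1) / control (0) status.
snps = [(1, 0, 0), (1, 0, 0), (0, 0, 1), (1, 1, 1),
        (0, 0, 0), (1, 1, 0), (0, 1, 0), (0, 0, 0)]
status = [1, 1, 1, 0, 0, 0, 1, 0]

L = [logic_term(x) for x in snps]
b0, b1 = fit_single_binary_logistic(L, status)
print(f"b0={b0:.4f} b1={b1:.4f} odds ratio={math.exp(b1):.2f}")
```

In a full logic regression, many candidate trees would be proposed and compared by a score such as the deviance; here only the scoring of a single candidate is shown.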

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A dissertation submitted in fulfillment of the requirements for the degree of Master in Computer Science and Computer Engineering

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Monte-Carlo Tree Search (MCTS) is a heuristic for searching large trees. We apply it to argumentative puzzles, where MCTS pursues the best argumentation with respect to a set of arguments to be argued. To make our ideas as widely applicable as possible, we integrate MCTS into an abstract setting for argumentation in which the content of arguments is left unspecified. Experimental results show the pertinence of this integration for learning argumentations by comparing it with a basic reinforcement learning approach.
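
The abstract does not say which selection rule the authors use inside MCTS. As an illustration only, the sketch below shows the UCB1 rule that is commonly used in the selection step, applied to hypothetical child statistics. The names `ucb1` and `select_child` and the example numbers are assumptions.

```python
import math


def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: average reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(children, parent_visits):
    """children maps a move name to (total_reward, visits)."""
    return max(children, key=lambda k: ucb1(*children[k], parent_visits))


# Hypothetical statistics for three candidate arguments at one node.
children = {"a": (6.0, 10), "b": (1.0, 2), "c": (0.0, 0)}
first_pick = select_child(children, parent_visits=12)

# Without the unvisited child, the rarely visited "b" gets the larger bonus.
visited_only = {k: v for k, v in children.items() if v[1] > 0}
second_pick = select_child(visited_only, parent_visits=12)
print(first_pick, second_pick)
```

The constant `c` trades off exploitation against exploration; larger values favor rarely visited children.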

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents an adaptive Sequential Monte Carlo approach for real-time applications. The Sequential Monte Carlo method is employed to estimate the states of dynamic systems using weighted particles. The proposed approach reduces the run-time computational complexity by adapting the size of the particle set. Multiple processing elements on FPGAs are dynamically allocated for improved energy efficiency without violating real-time constraints. A robot localisation application is developed based on the proposed approach. Compared to a non-adaptive implementation, the dynamic energy consumption is reduced by up to 70% without affecting the quality of solutions. © 2012 IEEE.
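
The abstract does not state how the particle set size is adapted. One possible way to picture the idea is sketched below, under the assumption that the effective sample size (ESS) of the weights drives the decision. The thresholds 0.8 and 0.3, the bounds on the set size and the function names are hypothetical and are not the paper's rule.

```python
def effective_sample_size(weights):
    """ESS = 1 / sum of squared normalized weights."""
    total = sum(weights)
    normalized = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalized)


def next_particle_count(weights, n_min=64, n_max=4096):
    """Hypothetical rule: halve the set when weights are even, double it when degenerate."""
    n = len(weights)
    ratio = effective_sample_size(weights) / n
    if ratio > 0.8:
        target = n // 2
    elif ratio < 0.3:
        target = n * 2
    else:
        target = n
    return max(n_min, min(n_max, target))


even = [1.0] * 256                  # all particles equally plausible
degenerate = [1.0] + [0.0] * 255    # one particle carries all the weight
print(next_particle_count(even), next_particle_count(degenerate))
```

A smaller particle set means fewer weight updates per time step, which is where the run-time and energy savings described in the abstract would come from.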

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Four simulation studies were carried out to examine the distribution of the inverse of normally distributed variables as a function of different variances, means, truncation points and sample sizes. The simulated variables were GMD, normally distributed and representing average daily gain, and DIAS, obtained from the inverse of GMD and representing the number of days needed to reach a given weight. In all studies, the SAS® system (1990) was used to simulate the data and to analyze the results. The sample means of DIAS depended on the standard deviations used in the simulation. Regression analyses showed a reduction in the mean and standard deviation of DIAS as the mean of GMD increased. Including a truncation point between 10 and 25% of the mean of GMD reduced the mean of GMD and increased that of DIAS when the coefficient of variation of GMD was above 25%. The effect of group size on the means of GMD and DIAS was not significant, but the average sample standard deviation and CV of GMD increased with group size. Because of the dependence between the mean and the standard deviation, and the variation observed in the standard deviations of DIAS as a function of group size, using DIAS as a selection criterion may reduce the accuracy of the variation. Therefore, replacing GMD with DIAS requires an analysis method robust enough to eliminate heterogeneity of variance.
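
The dependence of the mean of DIAS on the standard deviation of GMD can be reproduced with a small simulation. The sketch below uses Python rather than SAS, and every number in it is an assumption chosen for illustration: a mean GMD of 1.0 kg/day, standard deviations of 0.1 and 0.2, a truncation point at 25% of the mean and a target gain of 100 kg. The function name `simulate_days` is hypothetical.

```python
import random
import statistics


def simulate_days(mean_gmd, sd_gmd, n, truncation=0.0, target_gain=1.0, seed=0):
    """Draw GMD ~ Normal(mean, sd), keep values above the truncation point,
    and return DIAS = target_gain / GMD for each kept value."""
    rng = random.Random(seed)
    days = []
    while len(days) < n:
        gmd = rng.gauss(mean_gmd, sd_gmd)
        if gmd > truncation:
            days.append(target_gain / gmd)
    return days


low_sd = simulate_days(1.0, 0.1, 20000, truncation=0.25, target_gain=100.0)
high_sd = simulate_days(1.0, 0.2, 20000, truncation=0.25, target_gain=100.0)

mean_low = statistics.mean(low_sd)
mean_high = statistics.mean(high_sd)
print(f"mean DIAS: sd=0.1 -> {mean_low:.2f}, sd=0.2 -> {mean_high:.2f}")
```

Because 1/x is convex for positive x, the mean of DIAS exceeds target_gain divided by the mean of GMD, and the gap widens as the standard deviation of GMD grows.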